Planning How to Use the Infrastructure

2023 iThome Ironman Contest

DAY 16

Self-Challenge Group

2023 Learning and Sharing Series, Part 16

15th Ironman Contest

kevinyay945

2023-09-16 12:51:46

Future updates to this article will be posted at:
https://kevinyay945.com/golang-project-design/anki-support/design-for-infrastructure/

Anki

For Anki, I went with anki-connect.
It exposes the cards inside Anki through a local HTTP API, which makes them much easier to work with from our own program.

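While reading the docs I also found that it provides APIs that expose the current state of the Anki GUI. That makes things like the following possible: getting the card IDs of whatever you currently have selected in the Anki browser. I think this can cover the UI-interaction part of the project, since Anki's own UI is always going to be the best choice.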
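To give a rough idea of what a call looks like, here's a minimal Go sketch. It assumes anki-connect is listening on its default local address (http://127.0.0.1:8765), that requests are JSON objects with `action`, `version`, and optional `params`, and that the `guiSelectedNotes` action returns the IDs selected in the browser. The helper names `newRequest` and `invoke` are mine, not part of anki-connect.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// ankiURL is assumed to be anki-connect's default address; adjust if yours differs.
const ankiURL = "http://127.0.0.1:8765"

// request is the JSON envelope sent to anki-connect.
type request struct {
	Action  string `json:"action"`
	Version int    `json:"version"`
	Params  any    `json:"params,omitempty"`
}

// response is the JSON envelope anki-connect sends back.
type response struct {
	Result json.RawMessage `json:"result"`
	Error  *string         `json:"error"`
}

// newRequest encodes an action and its parameters as a version 6 request body.
func newRequest(action string, params any) ([]byte, error) {
	return json.Marshal(request{Action: action, Version: 6, Params: params})
}

// invoke posts the request and decodes the "result" field into out.
func invoke(action string, params, out any) error {
	body, err := newRequest(action, params)
	if err != nil {
		return err
	}
	resp, err := http.Post(ankiURL, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	var r response
	if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
		return err
	}
	if r.Error != nil {
		return fmt.Errorf("anki-connect: %s", *r.Error)
	}
	if len(r.Result) == 0 {
		return nil
	}
	return json.Unmarshal(r.Result, out)
}

func main() {
	var ids []int64
	if err := invoke("guiSelectedNotes", nil, &ids); err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("selected in browser:", ids)
}
```

Anki has to be running with the add-on installed for `main` to return anything; otherwise it just prints the connection error.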

There is one inconvenience, though:
it can't modify the data of the card you are currently viewing.
Here's the issue where this is discussed:
https://github.com/FooSoft/anki-connect/issues/82

There seems to be a workaround for this:
https://github.com/FooSoft/anki-connect/issues/82#issuecomment-1221895385

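As for audio files, after going through the docs, this is how to use them:

1. Use the API to get the path of the sound (media) folder.
2. Put the file you generated into that folder.
3. In the field of the card that needs it, fill in the file name using this format:
[sound:xxxxx.mp3]

Anki then treats the field as audio and plays it normally.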
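The three steps above might look like this in Go. It's only a sketch: `mediaDir` stands for whatever path the API returned in step 1, and `soundField` and `installAudio` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// soundField returns the text to put in a card field so that Anki plays
// the named media file.
func soundField(filename string) string {
	return fmt.Sprintf("[sound:%s]", filepath.Base(filename))
}

// installAudio copies the file at src into mediaDir (the path obtained in
// step 1) and returns the field text that refers to it.
func installAudio(mediaDir, src string) (string, error) {
	in, err := os.Open(src)
	if err != nil {
		return "", err
	}
	defer in.Close()

	dst := filepath.Join(mediaDir, filepath.Base(src))
	out, err := os.Create(dst)
	if err != nil {
		return "", err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		return "", err
	}
	if err := out.Close(); err != nil {
		return "", err
	}
	return soundField(dst), nil
}

func main() {
	// Temporary directories stand in for the real media folder here.
	mediaDir, err := os.MkdirTemp("", "media")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(mediaDir)

	src := filepath.Join(os.TempDir(), "mikan.mp3")
	if err := os.WriteFile(src, []byte("fake audio"), 0o644); err != nil {
		fmt.Println(err)
		return
	}
	defer os.Remove(src)

	field, err := installAudio(mediaDir, src)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(field) // [sound:mikan.mp3]
}
```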

OpenAI

With the OpenAI playground you can quickly produce a simple, working example of how to call the API.
https://platform.openai.com/playground
This is the prompt I used this time:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
  "model": "gpt-3.5-turbo",
  "messages": [
    {
      "role": "system",
      "content": "您現在是一個日文老師\n我會提供你一個日文的單詞\n要請你為一個JLPT程度為N5的同學造句，並在漢字後面附上相對應的平假名(請不要隨意的拆解我提供的單詞)\n也在造句的後面附上繁體中文的翻譯\n\n另外，最後我會附上曾經背過的日文單詞，在接下來的造句中，請盡可能的使用這些單詞\n\n接下來的所有回應請用日文來進行\n\n以下為曾經背過的日文單詞 \n\n両親,月餅,電池,彼の方,お兄さん,高速バス,お姉さん,映画,前,乗り場,冷蔵庫,学校,明後日,チケット,番線"
    },
    {
      "role": "user",
      "content": "みかん"
    },
    {
      "role": "assistant",
      "content": "彼女はみかんを食べています。\n彼女[かのじょ]はみかんを 食[た]べています。\n她正在吃橘子。"
    },
    {
      "role": "user",
      "content": "机"
    },
    {
      "role": "assistant",
      "content": "私の机は木製です。\n私[わたし]の 机[つくえ]は 木製[もくせい]です。\n我的桌子是木制的。"
    },
    {
      "role": "user",
      "content": "パソコン"
    },
    {
      "role": "assistant",
      "content": "私はパソコンで日本語を勉強します。\n私[わたし]はパソコンで日本語[にほんご]を勉強[べんきょう]します。\n我用電腦學習日語。"
    },
    {
      "role": "user",
      "content": "携帯"
    },
    {
      "role": "assistant",
      "content": "彼は携帯で友達とメッセージを送っています。\n彼[かれ]は携帯[けいたい]で友達[ともだち]とメッセージを送[おく]っています。\n他正在用手機和朋友發送訊息。"
    }
  ],
  "temperature": 0,
  "max_tokens": 256,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}'
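Each assistant reply in this prompt has three lines: the plain sentence, the same sentence with readings in brackets, and the Traditional Chinese translation. Assuming the model keeps to that shape (it may not always), a small Go helper could split a reply into fields for a card. `Example` and `parseReply` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Example holds the three parts of one generated reply.
type Example struct {
	Sentence    string // plain Japanese sentence
	Furigana    string // same sentence with readings, e.g. 食[た]べ
	Translation string // Traditional Chinese translation
}

// parseReply splits a reply on newlines, skips blank lines, and expects
// exactly three lines to remain.
func parseReply(content string) (Example, error) {
	var lines []string
	for _, l := range strings.Split(content, "\n") {
		if l = strings.TrimSpace(l); l != "" {
			lines = append(lines, l)
		}
	}
	if len(lines) != 3 {
		return Example{}, fmt.Errorf("expected 3 lines, got %d", len(lines))
	}
	return Example{Sentence: lines[0], Furigana: lines[1], Translation: lines[2]}, nil
}

func main() {
	reply := "彼女はみかんを食べています。\n彼女[かのじょ]はみかんを 食[た]べています。\n她正在吃橘子。"
	ex, err := parseReply(reply)
	if err != nil {
		fmt.Println("skip this reply:", err)
		return
	}
	fmt.Println(ex.Sentence)
	fmt.Println(ex.Furigana)
	fmt.Println(ex.Translation)
}
```

Returning an error instead of guessing keeps malformed replies out of the deck; the caller can retry or skip the word.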

In addition, the system message at the top gets filled in with the words that already exist in Anki.
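One way to do that is to keep the instruction part of the system message fixed and append the word list at request time. A sketch in Go: `systemPrompt` and `newChatRequest` are hypothetical names, the instruction text is passed in by the caller, and the JSON fields mirror the curl call above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// message is one entry of the "messages" array.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// chatRequest carries the subset of fields used in the curl example.
type chatRequest struct {
	Model       string    `json:"model"`
	Messages    []message `json:"messages"`
	Temperature float64   `json:"temperature"`
	MaxTokens   int       `json:"max_tokens"`
}

// systemPrompt appends the comma-separated known words to the instructions.
func systemPrompt(instructions string, knownWords []string) string {
	return instructions + "\n\n" + strings.Join(knownWords, ",")
}

// newChatRequest builds a request asking for an example sentence for word.
func newChatRequest(instructions string, knownWords []string, word string) chatRequest {
	return chatRequest{
		Model: "gpt-3.5-turbo",
		Messages: []message{
			{Role: "system", Content: systemPrompt(instructions, knownWords)},
			{Role: "user", Content: word},
		},
		Temperature: 0,
		MaxTokens:   256,
	}
}

func main() {
	// The instructions string is a placeholder for the real system text.
	words := []string{"両親", "月餅", "電池"}
	req := newChatRequest("(instructions from the system message)", words, "みかん")
	body, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(body))
}
```

The few-shot user/assistant pairs from the curl example would go between the system message and the final user message.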

Text to Speech

This time I chose Google's Text-to-Speech:
https://cloud.google.com/text-to-speech/docs/quickstarts?hl=en

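Using it is similar to the Google Drive API from earlier in this series. The steps are:

1. Enable Text-to-Speech.
2. Create a service account and generate a key for it (a JSON credentials file).
3. Put the key into the program.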

package main

import (
	"context"
	"fmt"
	"log"
	"os"

	texttospeech "cloud.google.com/go/texttospeech/apiv1"
	"cloud.google.com/go/texttospeech/apiv1/texttospeechpb"
	"google.golang.org/api/option"
)

func main() {
	ctx := context.Background()

	// Create a client. Replace YOUR_GCP_TOKEN with the contents of the
	// service account's JSON key.
	client, err := texttospeech.NewClient(ctx, option.WithCredentialsJSON([]byte("YOUR_GCP_TOKEN")))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Build the request: the text to synthesize, the voice, and the
	// audio format of the result.
	req := texttospeechpb.SynthesizeSpeechRequest{
		// The text input to be synthesized.
		Input: &texttospeechpb.SynthesisInput{
			InputSource: &texttospeechpb.SynthesisInput_Text{Text: "Hello, World!"},
		},
		// Select the language code ("ja-JP") and a specific voice by name.
		Voice: &texttospeechpb.VoiceSelectionParams{
			LanguageCode: "ja-JP",
			Name:         "ja-JP-Wavenet-B",
		},
		// Select the type of audio file to return.
		AudioConfig: &texttospeechpb.AudioConfig{
			AudioEncoding: texttospeechpb.AudioEncoding_MP3,
		},
	}

	resp, err := client.SynthesizeSpeech(ctx, &req)
	if err != nil {
		log.Fatal(err)
	}

	// resp.AudioContent holds the binary audio data; write it to disk.
	filename := "output.mp3"
	if err := os.WriteFile(filename, resp.AudioContent, 0o644); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Audio content written to file: %v\n", filename)
}

Running this produces an output.mp3 file.
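One thing to watch: `option.WithCredentialsJSON` expects the contents of a service account key file (a JSON document), not a short token string. Here's a minimal sketch for loading the key from disk and sanity-checking it before handing it to the client. The file name `service-account.json` and the helper `checkKey` are assumptions for illustration, and the check only looks at two fields such key files normally contain.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// checkKey reports whether data looks like a service account key: a JSON
// object whose "type" is "service_account" and that has a client e-mail.
func checkKey(data []byte) error {
	var k struct {
		Type        string `json:"type"`
		ClientEmail string `json:"client_email"`
	}
	if err := json.Unmarshal(data, &k); err != nil {
		return fmt.Errorf("not valid JSON: %w", err)
	}
	if k.Type != "service_account" || k.ClientEmail == "" {
		return errors.New("not a service account key")
	}
	return nil
}

func main() {
	data, err := os.ReadFile("service-account.json") // hypothetical path
	if err != nil {
		fmt.Println("cannot read key:", err)
		return
	}
	if err := checkKey(data); err != nil {
		fmt.Println("bad key:", err)
		return
	}
	// data can now be passed to option.WithCredentialsJSON(data).
	fmt.Println("key looks OK")
}
```

Reading the key from a file also keeps it out of the source code, which matters once the project goes into version control.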

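Also, the voice model I picked this time is WaveNet, because it comes with a free quota of 4 million characters. How the cost is calculated is described at the link below, for reference: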
https://cloud.google.com/text-to-speech/pricing?hl=zh-tw
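To get a feel for how far a free quota goes, you can count characters before sending any text. This sketch assumes usage is counted per character (not per byte) and takes the quota as a parameter rather than hard-coding any price, so check the pricing page for the rules that apply to your voice type. `charsUsed` and `remaining` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// charsUsed counts the characters (runes) in all texts. len() would count
// bytes instead, which is roughly three times higher for Japanese text.
func charsUsed(texts []string) int {
	total := 0
	for _, t := range texts {
		total += utf8.RuneCountInString(t)
	}
	return total
}

// remaining returns what is left of quota after synthesizing texts.
func remaining(quota int, texts []string) int {
	return quota - charsUsed(texts)
}

func main() {
	texts := []string{"彼女はみかんを食べています。", "私の机は木製です。"}
	fmt.Println(charsUsed(texts))            // 23
	fmt.Println(remaining(4_000_000, texts)) // 3999977
}
```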

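Addendum:
In the end I switched the voice model to Neural2, because WaveNet's pitch accent is still a bit off at times. Try switching between the models yourself to hear the difference.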
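Switching models only means changing the voice name in the request. As a sketch, here is the JSON body for the REST endpoint (`text:synthesize`) built in Go with just the standard library, so the two variants can be compared side by side. The voice names `ja-JP-Wavenet-B` and `ja-JP-Neural2-B` are examples; confirm what's available in the voice list before relying on them. `synthesizeBody` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// These types mirror the fields of the REST synthesize request used here.
type input struct {
	Text string `json:"text"`
}

type voice struct {
	LanguageCode string `json:"languageCode"`
	Name         string `json:"name"`
}

type audioConfig struct {
	AudioEncoding string `json:"audioEncoding"`
}

type synthesizeRequest struct {
	Input       input       `json:"input"`
	Voice       voice       `json:"voice"`
	AudioConfig audioConfig `json:"audioConfig"`
}

// synthesizeBody builds the request body for a Japanese MP3 synthesis with
// the given voice name.
func synthesizeBody(text, voiceName string) ([]byte, error) {
	return json.Marshal(synthesizeRequest{
		Input:       input{Text: text},
		Voice:       voice{LanguageCode: "ja-JP", Name: voiceName},
		AudioConfig: audioConfig{AudioEncoding: "MP3"},
	})
}

func main() {
	// Only the voice name differs between the two requests.
	for _, name := range []string{"ja-JP-Wavenet-B", "ja-JP-Neural2-B"} {
		body, err := synthesizeBody("こんにちは", name)
		if err != nil {
			fmt.Println(err)
			return
		}
		fmt.Println(string(body))
	}
}
```

With the Go client from earlier, the equivalent change is the `Name` field of `VoiceSelectionParams`.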